\documentclass[10pt,twocolumn,letterpaper]{article}

\usepackage{cvpr}
\usepackage{times}
\usepackage{epsfig}
\usepackage{graphicx}
\usepackage{amsmath}
\usepackage{amssymb}

% Include other packages here, before hyperref.

% If you comment hyperref and then uncomment it, you should delete
% egpaper.aux before re-running latex.  (Or just hit 'q' on the first latex
% run, let it finish, and you should be clear).
\usepackage[pagebackref=true,breaklinks=true,letterpaper=true,colorlinks,bookmarks=false]{hyperref}


\cvprfinalcopy % *** Uncomment this line for the final submission

\def\cvprPaperID{****} % *** Enter the CVPR Paper ID here
\def\httilde{\mbox{\tt\raisebox{-.5ex}{\symbol{126}}}}

% Pages are numbered in submission mode, and unnumbered in camera-ready
\ifcvprfinal\pagestyle{empty}\fi
\begin{document}

%%%%%%%%% TITLE
\title{Project Status Report: Tiny Motion Capture System using Wii Remote}

\author{Wei Zhang, Yi Gong\\
Department of Computer Science\\
University of California, Santa Barbara\\
{\tt\small wei, ygong@cs.ucsb.edu}\\
% For a paper whose authors are all at the same institution,
% omit the following lines up until the closing ``}''.
% Additional authors and addresses can be added with ``\and'',
% just like the second author.
% To save space, use either the email address or home page, not both
%\and
%Second Author\\
%Institution2\\
%First line of institution2 address\\
%{\small\url{http://www.cs.ucsb.edu/~wei}}
}

\maketitle
\thispagestyle{empty}

%%%%%%%%% ABSTRACT
%\begin{abstract}

%\end{abstract}

%%%%%%%%% BODY TEXT
\section{Introduction}
Motion capture has been well studied over the past decades, 
and many commercial systems are available on the market. However, most of them 
are complicated and very expensive, built only for professionals.

The Nintendo Wii console has a built-in IR sensor in its remote
controller, with on-board hardware processing that can track
up to four IR markers simultaneously. This gives us the possibility
of building a very low-cost motion capture system. 
Considering that the Wii has
already been sold to over 20M families, 
our system will cost those users only a few IR markers. 
Such a low-cost, tiny motion capture system can serve as a basis 
for many new applications in home interactive entertainment.

Several interesting applications of the Wii remote have been developed 
and have had great impact; the most famous are the head tracking
and multi-touch whiteboard developed by Johnny Lee at CMU~\cite{Johnny07}.
However, all current applications use only a single Wii remote for 
its built-in 2D tracking capability.

In our project we are going to build a 3D tracking system using two Wii
remotes. This is a classic problem of 3D reconstruction from multiple views, 
which has been explored by many researchers. For the two-view case, 
many studies have addressed the fundamental 
matrix and calibrated or uncalibrated image matching~\cite{luong95,Higgins87,Zhang95,Hartley95}. 
The book~\cite{Hartley03} also covers these issues well. 
In our method, we will first calibrate our cameras with the algorithm 
of~\cite{Zhang00}, then compute the fundamental matrix 
and reconstruct the 3D positions following~\cite{Faugeras92}.
%------------------------------------------------------------------------

\section{Our Approach}

%------------------------------------------------------------------------

\subsection{Hardware Platform}
Our system consists of two Wii remotes used as IR cameras and four IR LEDs used as markers.
Each Wii remote contains a $1024\times768$ infrared camera 
with built-in on-board hardware processing that 
tracks up to four points (e.g. IR LED lights) at 100\,Hz. 
In addition, the Wii remote can be connected to a PC via Bluetooth. 
Each marker is just a regular IR LED powered by an AA battery; 
the Wii remote is most sensitive to wavelengths between 800 and 950\,nm.
%-------------------------------------------------------------------------

\subsection{Software Infrastructure}

WiiRemoteLib\\
Camera Calibration\\
Stereo Matching\\
Triangulation\\
Demo Application

%-------------------------------------------------------------------------

\section{Stereo Matching}
The stereo matching problem consists of finding pairs of detected 
markers in different images, captured at the same time from different 
viewpoints, such that each pair corresponds to the projections of the
same scene point. Unlike common commercial motion capture systems, 
which use a cloud of markers, we use only four, so most of the time the basic
technique for spatial correspondence will work just fine. In special cases, 
however, for example when two epipolar lines constructed
from image 1 are very close in image 2, or two points in image 2 are too close
to the same epipolar line, temporal correspondence information can 
be used to improve the correctness of the matching.

\subsection{Spatial Correspondence}
We will first attach four LEDs to a square planar board, 
then use OpenCV's {\tt cvCalibrateCamera2} and 
{\tt cvFindExtrinsicCameraParams2}~\cite{Zhang00} 
functions to obtain our cameras' intrinsic and extrinsic parameters.

The camera calibration will be done automatically 
in the initial phase of running our application.

In the 3D reconstruction step, we will find the 
correspondence between point pairs on the two image 
planes with an algorithm based on epipolar geometry. 
By the epipolar constraint~\cite{Faugeras92}, 
the projections $p_1$ and $p_2$ of a point $p$ onto 
the two image planes, written in homogeneous coordinates, 
satisfy $$p_2^{T}Fp_1 = 0,$$

where $F$ is a $3\times 3$, rank-2 matrix called the 
fundamental matrix. The fundamental matrix is 
related to the essential matrix $E$: $$E = [t]_{\times}R,$$ 
where $R$ and $t$ are the rotation and translation from the first camera to the second camera, and $[t]_{\times}$ is the cross-product matrix of $t$.
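
As a sketch of how $E$ and $F$ are connected, assume the following notation, which this report does not otherwise fix: $K_1$ and $K_2$ are the $3\times 3$ intrinsic matrices of the first and second camera, and a point $X_1$ in the first camera frame maps to $X_2 = RX_1 + t$ in the second. Let $[t]_{\times}$ denote the skew-symmetric matrix that satisfies $[t]_{\times}v = t\times v$ for every vector $v$:
$$[t]_{\times} = \begin{bmatrix} 0 & -t_z & t_y\\ t_z & 0 & -t_x\\ -t_y & t_x & 0 \end{bmatrix}.$$
The essential matrix is then $E = [t]_{\times}R$. Since $X_2$ is orthogonal to $t\times X_2 = [t]_{\times}RX_1$, we get $X_2^{T}EX_1 = 0$. Substituting $X_i \propto K_i^{-1}p_i$ for the homogeneous pixel coordinates $p_i$ gives
$$F = K_2^{-T}EK_1^{-1}, \qquad p_2^{T}Fp_1 = 0.$$
Under these assumptions, $F$ follows directly from the calibration output and does not need to be estimated from point matches.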

Since our cameras are calibrated, computing 
the fundamental matrix is straightforward. Given 
the fundamental matrix, we can relate the 
tracked points in the two images from the two cameras. 
Once the correspondences between the points 
in the two image planes are found, the 3D positions of the markers can be computed by triangulation. 
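
One common way to carry out this last step is linear triangulation; the report does not commit to a specific method, so the following is only a sketch under assumed notation. Let $P_1 = K_1[I\,|\,0]$ and $P_2 = K_2[R\,|\,t]$ be the $3\times 4$ projection matrices built from the intrinsic matrices $K_1$, $K_2$ and the relative pose $(R, t)$, let $P_i^{jT}$ denote the $j$-th row of $P_i$, and let $(u_i, v_i)$ be the pixel coordinates of the matched marker in image $i$. The homogeneous 3D point $X$ then satisfies
$$\begin{bmatrix} u_1P_1^{3T} - P_1^{1T}\\ v_1P_1^{3T} - P_1^{2T}\\ u_2P_2^{3T} - P_2^{1T}\\ v_2P_2^{3T} - P_2^{2T} \end{bmatrix} X = 0.$$
With noisy measurements, $X$ can be estimated as the right singular vector of this $4\times 4$ matrix associated with its smallest singular value, and then divided by its last component to obtain the 3D position.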
%-------------------------------------------------------------------------

\subsection{Temporal Correspondence}
The temporal correspondence problem (tracking) involves matching 
two sets of 3D points representing detected markers at two consecutive
 frames, respectively. Given the correspondence between consecutive
 frames, a time series of 3D coordinates is built.

In order to match points in the images to objects, we must first estimate 
where the objects are expected to be. Since our system is used for low-speed
human motion, object movement can be modeled as 
uniformly accelerated rectilinear motion. Many object tracking and 
position estimation algorithms exist for this model; we plan to use the
classic Kalman filter to compute the estimate.
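
A minimal sketch of such a filter, assuming one constant-acceleration state per marker (all symbols below are illustrative notation of ours): let $x_k = (s_k, v_k, a_k)^{T}$ stack the 3D position, velocity, and acceleration at frame $k$, let $z_k$ be the measured 3D position, and let $\Delta t$ be the frame interval (for example $0.01$\,s at a 100\,Hz report rate). The model is
$$x_{k+1} = Ax_k + w_k, \qquad z_k = Hx_k + n_k,$$
$$A = \begin{bmatrix} I & \Delta t\,I & \frac{1}{2}\Delta t^{2}I\\ 0 & I & \Delta t\,I\\ 0 & 0 & I \end{bmatrix}, \quad H = \begin{bmatrix} I & 0 & 0 \end{bmatrix},$$
where each block is $3\times 3$ and $w_k$, $n_k$ are zero-mean noise terms with covariances $Q$ and $N$, which are tuning parameters that would have to be chosen experimentally. With state covariance $\Sigma$ and gain $G_k$, the prediction step is
$$\hat{x}_{k|k-1} = A\hat{x}_{k-1|k-1},$$
$$\Sigma_{k|k-1} = A\Sigma_{k-1|k-1}A^{T} + Q,$$
and the update step is
$$G_k = \Sigma_{k|k-1}H^{T}(H\Sigma_{k|k-1}H^{T} + N)^{-1},$$
$$\hat{x}_{k|k} = \hat{x}_{k|k-1} + G_k(z_k - H\hat{x}_{k|k-1}),$$
$$\Sigma_{k|k} = (I - G_kH)\Sigma_{k|k-1}.$$
The predicted position $H\hat{x}_{k|k-1}$ is the estimate of where a marker is expected to be before the new frame is matched.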

Our stereo matching algorithm can therefore be described as follows. For each 
time {\em t}, we compute a position estimate for every point based 
on its previous position and estimated speed. Then we project the estimated
positions onto the two images and match them with the detected image points to build the 
correspondence between points and physical objects.
Finally, we map the points in image 1 to epipolar lines in image 2, 
and match each point in image 2 to its corresponding line.
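
The final line-matching step can be sketched as follows, using hypothetical symbols $\ell$, $d$, and $\tau$ that the report itself does not define, and assuming the convention $p_2^{T}Fp_1 = 0$ for matched homogeneous points. For a point $p_1$ in image 1, its epipolar line in image 2 is $\ell = Fp_1 = (a, b, c)^{T}$, and the distance from a candidate point $q = (u, v, 1)^{T}$ in image 2 to that line is
$$d(q, \ell) = \frac{|au + bv + c|}{\sqrt{a^{2} + b^{2}}}.$$
Each point in image 2 is assigned to the line with the smallest $d$, and the match is accepted only if $d < \tau$ for some pixel threshold $\tau$. When two candidates have similar distances, the positions predicted in the temporal step can be used to break the tie.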
%-------------------------------------------------------------------------

%\section{Conclusion}
\section{Planned Experiments}

The experiments will be based on real human body motion and divided into two parts. First, arm motion: one person will attach four LEDs to the left upper arm, left lower arm, right upper arm, and right lower arm, respectively, and then move the arms at slow, medium, and fast speeds to see whether the system can capture the motion and determine the spatial and temporal correspondences correctly. Second, similar tests will be performed on the legs to capture walking, running, and jumping.

We expect our system to work well given ideal input, that is, no occlusion and motion at normal speed. We will try to make our application more robust by estimating point positions and by tolerating errors in the detected point data, matching the most likely points when searching for correspondences. The outcome of these robustness experiments cannot be predicted yet, but we want the system to handle short and partial occlusions and to tolerate errors. 
%-------------------------------------------------------------------------


{\small
\bibliographystyle{ieee}
\bibliography{egbib}
}


\end{document}
